Incorporating Semantic Knowledge with MRF Term Dependency Model in Medical Document Retrieval
Abstract
Term dependency models are generally better than bag-of-words models, because complete concepts are often represented by multiple terms. However, without semantic knowledge, such models may introduce many false dependencies among terms, especially when the document collection is small and homogeneous (e.g. newswire documents, medical documents). The main contribution of this work is to incorporate semantic knowledge into term dependency models, so that more accurate dependency relations are assigned to the terms in a query. Experiments are run on the CLEF2013 eHealth Lab medical information retrieval data set, and the baseline term dependency model is the popular MRF (Markov Random Field) model [1], which has been shown to outperform traditional independence models in general-domain search. Experimental results show that, in medical document retrieval, the full dependency MRF model performs worse than the independent model, but that it can be significantly improved by incorporating semantic knowledge.
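As a rough illustration of the idea in the abstract, the sketch below contrasts the term groups ("cliques") that an independent model, a full dependency model, and a semantically filtered model would score for one query. The abstract does not say how semantic knowledge is applied, so the filter here is an assumption: `concepts` is a hypothetical set of known multi-term concepts, and the query and function names are invented for the example.

```python
from itertools import combinations


def independent_cliques(terms):
    """Independent model: each query term is scored on its own."""
    return [(t,) for t in terms]


def full_dependency_cliques(terms):
    """Full dependency: every subset of two or more query terms is a clique."""
    cliques = []
    for size in range(2, len(terms) + 1):
        cliques.extend(combinations(terms, size))
    return cliques


def semantic_cliques(terms, concepts):
    """Keep only the dependency cliques that match a known concept.

    `concepts` is a hypothetical set of frozensets of terms, standing in
    for whatever semantic resource is available (an assumption, not the
    paper's stated method).
    """
    return [c for c in full_dependency_cliques(terms) if frozenset(c) in concepts]


# Hypothetical query and concept list, for illustration only.
query = ["coronary", "artery", "disease", "treatment"]
concepts = {
    frozenset(["coronary", "artery"]),
    frozenset(["coronary", "artery", "disease"]),
}

print(len(full_dependency_cliques(query)))  # 11 candidate dependencies
print(semantic_cliques(query, concepts))    # only the two concept-backed ones
```

With four query terms the full dependency model proposes 11 multi-term cliques, most of which (for example "artery treatment") do not correspond to a concept; the filter keeps only the two that do, which is one way false dependencies could be pruned.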
Related resources
The Use of Dependency Relation Graph to Enhance the Term Weighting in Question Retrieval
With the emergence of community-based question answering (cQA) services, question retrieval has become an integral part of information and knowledge acquisition. Though existing information retrieval (IR) technologies have been found to be successful for document retrieval, they are less effective for question retrieval due to the inherent characteristics of questions, which have shorter texts....
A Markov Random Field Topic Space Model for Document Retrieval
This paper proposes a novel statistical approach to intelligent document retrieval. It seeks to offer a more structured and extensible mathematical approach to the term generalization done in the popular Latent Semantic Analysis (LSA) approach to document indexing. A Markov Random Field (MRF) is presented that captures relationships between terms and documents as probabilistic dependence assump...
A Latent Semantic Structure Model for Text Classification
Latent Semantic Indexing (LSI) has been successfully applied to information retrieval and classification. LSI can deal with the problems of polysemy and synonymy, and can reduce noise in the raw document-term matrix. However, LSI may ignore important features for some small categories because they are not the most important features for all the document collection. In this paper, we describe a ...
Enabled Generalized Vector Space Model to Improve Document Retrieval
This paper presents two approaches to semantic search by incorporating Linked Data annotations of documents into a Generalized Vector Space Model. One model exploits taxonomic relationships among entities in documents and queries, while the other model computes term weights based on semantic relationships within a document. We publish an evaluation dataset with annotated documents and queries a...
Evaluating a Novel Kind of Retrieval Models Based on Relevance Decision Making in a Relevance Feedback Environment
This paper presents the results of our participation in the relevance feedback track using our novel retrieval models. These models simulate human relevance decision-making. For each document location of a query term, information from its document-context at that location determines the relevance decision outcomes there. The relevance values for all document locations of all query terms in the...
Publication date: 2015